Mining Strongly Correlated Intervals with Hypergraphs
Abstract
Correlation is an important statistical measure for estimating dependencies between numerical attributes in multivariate datasets. Previous correlation discovery algorithms mostly focus on finding piecewise correlations between the attributes. Other research efforts, such as correlation preserving discretization, can find strongly correlated intervals through a discretization process that preserves correlation. However, discretization-based methods suffer from some fundamental problems, such as information loss and crisp boundaries. In this paper, we propose a novel method to discover strongly correlated intervals from numerical datasets without using discretization. We propose a hypergraph model to capture the underlying correlation structure in multivariate numerical data and a corresponding algorithm to discover strongly correlated intervals from the hypergraph model. Strongly correlated intervals can be found even when the corresponding attributes are only weakly correlated or uncorrelated. Experimental results from a health social network dataset show the effectiveness of our algorithm.
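The claim that intervals can be strongly correlated even when the attributes as a whole are not can be illustrated with a small sketch. This is not the paper's hypergraph algorithm; it only computes Pearson's correlation coefficient on made-up toy data, once over all records and once restricted to an interval of one attribute. The helper names `pearson` and `pearson_in_interval` and the data are assumptions introduced for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def pearson_in_interval(xs, ys, low, high):
    """Correlation restricted to records whose x value lies in [low, high]."""
    pairs = [(x, y) for x, y in zip(xs, ys) if low <= x <= high]
    sub_x = [x for x, _ in pairs]
    sub_y = [y for _, y in pairs]
    return pearson(sub_x, sub_y)


# Toy data (made up): y rises with x up to x = 9, then falls symmetrically.
xs = list(range(20))
ys = [x if x < 10 else 19 - x for x in xs]

overall = pearson(xs, ys)                    # whole attribute range
left = pearson_in_interval(xs, ys, 0, 9)     # rising interval only
right = pearson_in_interval(xs, ys, 10, 19)  # falling interval only
print(overall, left, right)
```

On this toy data the overall coefficient is essentially zero, while the two intervals are perfectly positively and negatively correlated. A fixed split at x = 9 is chosen by hand here; finding such intervals automatically, without discretization, is the problem the paper addresses.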
Similar resources
Mining top-k strongly correlated item pairs without minimum correlation threshold
Given a user-specified minimum correlation threshold and a transaction database, the problem of mining strongly correlated item pairs is to find all item pairs with Pearson's correlation coefficients above the threshold. However, setting such a threshold is by no means an easy task. In this paper, we consider a more practical problem: mining top-k strongly correlated item pairs, where k is the ...
Hypergraphs and fast mining of association rules
The activities I have carried out over the past year can be summarized as follows. Nov. 2002-Jun. 2003: I started my doctorate, working in the field of Information Retrieval. Jul. 2003: I attended a two-week summer school at Lipari focused on “Data Mining and Pattern Matching”. May 2003-now: I started to study directed hypergraphs, and in particular I focused on algorithms for optimal hyperpaths. I’ll c...
Pattern Mining for General Intelligence: The FISHGRAM Algorithm for Frequent and Interesting Subhypergraph Mining
Fishgram, a novel algorithm for recognizing frequent or otherwise interesting sub-hypergraphs in large, heterogeneous hypergraphs, is presented. The algorithm’s implementation in the OpenCog integrative AGI framework is described, and concrete examples are given showing the patterns it recognizes in OpenCog’s hypergraph knowledge store when the OpenCog system is used to control a virtual agent in ...
Cellular resolutions of cointerval ideals
Minimal cellular resolutions of the edge ideals of cointerval hypergraphs are constructed. This class of d–uniform hypergraphs coincides with the complements of interval graphs (for the case d = 2), and strictly contains the class of ‘strongly stable’ hypergraphs corresponding to pure shifted simplicial complexes. The polyhedral complexes supporting the resolutions are described as certain spac...
On Point Covers of Multiple Intervals and Axis-Parallel Rectangles
In certain families of hypergraphs the transversal number is bounded by some function of the packing number. In this paper we study hypergraphs related to multiple intervals and axis-parallel rectangles, respectively. Essential improvements of former established upper bounds are presented here. We explore the close connection between the two problems at issue.